Overview 


ZFS on Mac OS X 


Developer Preview v1.0 


ZFS is a new filesystem from Sun Microsystems which has been ported by Apple to Mac OS X. The initial (10.5.0) release of Leopard will 
restrict ZFS to read-only, so no ZFS pools or filesystems can be modified or created. This Developer Preview will enable full read/write 
capability, which includes the creation/destruction of ZFS pools and filesystems. 


Warning 


This Developer Preview should not be used on production systems or systems with important data. As with the introduction of any new 
filesystem, please take extra care to perform and verify proper backups of your data. 


Feedback 


This ZFS Developer Preview has been made available to provide Apple with feedback from a broad range of users and environments. 
Please send this feedback to zfs@apple.com. For those with Radar access, submit bugs to the “ZFS” component.


What is ZFS? 


Here’s a summary, taken directly from http://www.opensolaris.org/os/community/zfs/whatis sometime in Feb. 2007:


ZFS is a new kind of filesystem that provides simple administration, transactional semantics, end-to-end data integrity, and immense
scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management.
We've blown away 20 years of obsolete assumptions, eliminated complexity at the source, and created a storage system that's actually
a pleasure to use.


ZFS presents a pooled storage model that completely eliminates the concept of volumes and the associated problems of partitions,
provisioning, wasted bandwidth and stranded storage. Thousands of filesystems can draw from a common storage pool, each one consuming
only as much space as it actually needs. The combined I/O bandwidth of all devices in the pool is available to all filesystems at
all times.


All operations are copy-on-write transactions, so the on-disk state is always valid. There is no need to fsck(1M) a ZFS filesystem, ever.
Every block is checksummed to prevent silent data corruption.


Any snapshot can generate a full backup, and any pair of snapshots can generate an incremental backup.


Incremental backups are so efficient that they can be used for remote replication — e.g. to transmit an incremental update every 10 seconds. 


There are no arbitrary limits in ZFS. You can have as many files as you want; full 64-bit file offsets; unlimited links, directory entries, snapshots, and so on. 


ZFS provides built-in compression. In addition to reducing space usage by 2-3x, compression also reduces the amount of I/O by 2-3x. For this reason, enabling 
compression actually makes some workloads go faster. 


In addition to filesystems, ZFS storage pools can provide volumes for applications that need raw-device semantics. ZFS volumes can be used as swap devices, 
for example. And if you enable compression on a swap volume, you now have compressed virtual memory. 


ZFS administration is both simple and powerful. Please see the zpool(1M) and zfs(1M) man pages for more information — and be sure to check out the Getting 
Started section for a whirlwind tour. 


ZFS is already quite snappy on most workloads, and we're just getting started.


For more on ZFS, check out the following URLs: 
• http://www.opensolaris.org/os/community/zfs
• http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
• http://opensolaris.org/os/community/zfs/docs/zfsadmin_0817.pdf


Getting Started 


The underlying storage used by ZFS filesystems involves pools. A pool may consist of one or more whole disks or disk partitions. 
Basically, these whole disks and/or disk partitions can be combined in several different ways: dynamic striping, mirroring, or RAIDZ. 
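

As a rough illustration of how the three layouts trade capacity for redundancy, the sketch below estimates usable space for a pool
built from equal-sized disks. The helper name pool_capacity and its simplified formulas (striping keeps all space, a mirror keeps one
disk's worth, single-parity RAIDZ gives up one disk's worth) are assumptions made for this example; a real pool also reserves some
space for metadata, so actual figures will be somewhat lower.

```shell
# Hypothetical helper (not a ZFS command): estimate usable capacity in GB
# for a pool made of COUNT equal-sized disks of SIZE_GB each.
# Usage: pool_capacity LAYOUT COUNT SIZE_GB
pool_capacity() {
    layout=$1; count=$2; size=$3
    case "$layout" in
        stripe) echo $(( count * size )) ;;        # dynamic striping: all space usable
        mirror) echo "$size" ;;                    # N-way mirror: one disk's worth
        raidz)  echo $(( (count - 1) * size )) ;;  # single parity: one disk's worth lost
        *)      echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

pool_capacity stripe 3 30   # prints 90
pool_capacity mirror 2 30   # prints 30
pool_capacity raidz 3 30    # prints 60
```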


In all cases, the disks need to use the GUID Partition Table (GPT). ZFS typically works best when it owns the entire disk, due in
part to how conservative it is with the write cache.
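

One way to check the GPT requirement before going further is to look at the whole-disk row that "diskutil list" prints. The sketch
below is only an illustration: uses_gpt is a hypothetical helper, and it assumes the listing format shown in this document, where the
second column is the partition type and the last column is the identifier.

```shell
# Hypothetical helper (not part of diskutil): reads `diskutil list` output
# on stdin and succeeds only if the whole-disk row for DISK reports the
# GUID partition scheme.
# Usage: diskutil list | uses_gpt DISK
uses_gpt() {
    awk -v disk="$1" '
        $NF == disk && $2 == "GUID_partition_scheme" { found = 1 }
        END { exit !found }
    '
}

# Typical use on a live system:
#   diskutil list | uses_gpt disk2 || echo "disk2 needs a GPT label first"
```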


In the simplest case, let’s use a single drive for our “puddle” storage pool. The commands in the following example are issued
as root.


First, find out which device node to use with a "diskutil list" command. In the example below, I'm going to work with /dev/disk2, which
currently has an APM (Apple Partition Map) label and an existing HFS filesystem. I'm going to replace the APM label with a GPT one and
blow away the HFS "FW" filesystem. You should unmount any mounted filesystems on the target drive at this point.


# diskutil list
/dev/disk2
   #: type                     name           size       identifier
   0: Apple_partition_scheme                  *9.4 GB    disk2
   1: Apple_partition_map                     31.5 KB    disk2s1
   2: Apple_HFS                FW             9.2 GB     disk2s3


Now I'm going to place a GPT label on that disk: 


# 

Started partitioning on disk disk2
Creating partition map
[ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ]
Finished partitioning on disk disk2
/dev/disk2
   #: type                     name           size       identifier
   0: GUID_partition_scheme                   *9.4 GB    disk2
   1: EFI                                     200.0 MB   disk2s1
   2: ZFS                                     9.0 GB     disk2s2


And create our simple ZFS pool (named "puddle"): 


# zpool create puddle /dev/disk2s2 


And then check my work, noting that my new ZFS filesystem is available at /Volumes/puddle:


# zpool status puddle
  pool: puddle
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        puddle       ONLINE       0     0     0
          disk2s2    ONLINE       0     0     0

errors: No known data errors


# df -hl /Volumes/puddle 
Filesystem Size Used Available Capacity Mounted on 
puddle 8.9Gi 19Ki 8.9Gi 1% /Volumes/puddle 


To create a mirror or a RAIDZ, take a look at the following example where "tank" (a mirrored pair) and "dozer" (a RAIDZ set) are the 
names of our pools. 


# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            disk2s2  ONLINE       0     0     0
            disk3s2  ONLINE       0     0     0

errors: No known data errors


# 

# zpool status dozer
  pool: dozer
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        dozer        ONLINE       0     0     0
          raidz1     ONLINE       0     0     0
            disk4s2  ONLINE       0     0     0
            disk5s2  ONLINE       0     0     0
            disk6s2  ONLINE       0     0     0

errors: No known data errors
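

Pools like "tank" and "dozer" would presumably be created with the same "zpool create" form used for "oddcouple" later in this
document, with "raidz" in place of "mirror" for the RAIDZ set. The sketch below is a dry run only: zpool_create_cmd is a hypothetical
helper that prints the command line instead of executing it, so you can review it before running it as root. The two-device minimum
it enforces is an assumption chosen for this example.

```shell
# Hypothetical dry-run helper: prints, but does not run, the zpool command
# that would create POOL as a single mirror or raidz group.
# Usage: zpool_create_cmd POOL mirror|raidz DEVICE DEVICE...
zpool_create_cmd() {
    # Need a pool name, a layout, and at least two devices.
    if [ "$#" -lt 4 ]; then
        echo "usage: zpool_create_cmd POOL mirror|raidz DEVICE DEVICE..." >&2
        return 1
    fi
    pool=$1; layout=$2
    shift 2
    case "$layout" in
        mirror|raidz) ;;
        *) echo "unsupported layout: $layout" >&2; return 1 ;;
    esac
    echo "zpool create $pool $layout $*"
}

zpool_create_cmd tank mirror disk2s2 disk3s2          # prints: zpool create tank mirror disk2s2 disk3s2
zpool_create_cmd dozer raidz disk4s2 disk5s2 disk6s2  # prints: zpool create dozer raidz disk4s2 disk5s2 disk6s2
```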


To add more storage to an existing pool, like “tank” from the example above, take a look at the following example. 


# 

# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            disk2s2  ONLINE       0     0     0
            disk3s2  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            disk7s2  ONLINE       0     0     0
            disk8s2  ONLINE       0     0     0

errors: No known data errors
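

A status like the one above, with a second mirror in "tank", is what you would expect after something like "zpool add tank mirror
disk7s2 disk8s2"; that exact command line is an assumption here, not taken from this document. To confirm that a pool really gained a
group, a small sketch such as the following can count the mirror and raidz rows in the status output. The helper name
count_vdev_groups is hypothetical, and it simply matches on the first column, so a pool or device whose name begins with "mirror" or
"raidz" would be miscounted.

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints how
# many rows have a first field starting with "mirror" or "raidz".
# Usage: zpool status tank | count_vdev_groups
count_vdev_groups() {
    awk '$1 ~ /^(mirror|raidz)/ { n++ } END { print n + 0 }'
}

# Typical use on a live system, before and after growing the pool:
#   zpool status tank | count_vdev_groups
#   zpool add tank mirror disk7s2 disk8s2    # assumed command, run as root
#   zpool status tank | count_vdev_groups
```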


What if I want to use an existing partition? 


You can use existing partitions as long as the partition map on that drive is GPT, you've unmounted them, and you understand that
pre-existing data on those partitions will be lost. Use the "diskutil" command to change the label type in place.


In the example below, I want to use the pre-existing HFS "blank" and "Untitled 2" partitions for my "oddcouple" ZFS mirrored pool.


Again, please note that with this example, any previous data on the HFS “blank” and “Untitled 2” partitions will 
(obviously) be overwritten! 


Note the Apple_HFS partition type for the “blank” and “Untitled 2” partitions:


# diskutil list
/dev/disk0
   #: type                     name           size       identifier
   0: GUID_partition_scheme                   *149.1 GB  disk0
   1: EFI                                     200.0 MB   disk0s1
   2: Apple_HFS                Leopard        29.7 GB    disk0s2
   3: Apple_HFS                Leopard9A376   29.7 GB    disk0s3
   4: ZFS                      zfstest        29.7 GB    disk0s4
   5: Apple_HFS                blank          29.7 GB    disk0s5
   6: Apple_HFS                HFSJ_Boot      29.5 GB    disk0s6
/dev/disk1
   #: type                     name           size       identifier
   0: GUID_partition_scheme                   *153.4 GB  disk1
   1: EFI                                     200.0 MB   disk1s1
   2: Apple_HFS                Untitled 1     30.7 GB    disk1s2
   3: Apple_HFS                Untitled 2     30.7 GB    disk1s3
   4: Apple_HFS                Leopard9A376   30.7 GB    disk1s4
   5: Apple_HFS                LaCieLeopard   30.7 GB    disk1s5
   6: Apple_HFS                Leopard9A377a  29.9 GB    disk1s6
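

Because the next step destroys whatever is on the chosen partitions, it can be worth double-checking what each identifier currently
holds. The sketch below is one way to do that; part_type is a hypothetical helper, and it assumes the "diskutil list" layout shown in
this document, with the type in the second column and the identifier in the last.

```shell
# Hypothetical helper: reads `diskutil list` output on stdin and prints the
# type column of the row whose identifier is PART. Fails if PART is absent.
# Usage: diskutil list | part_type PART
part_type() {
    awk -v part="$1" '
        $NF == part { print $2; found = 1 }
        END { exit !found }
    '
}

# Typical use on a live system:
#   diskutil list | part_type disk0s5
#   diskutil list | part_type disk1s3
```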


Change the partition type to ZFS and check your work: 


# 


[ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ]
Finished erase on disk disk0s5


#


[ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ]
Finished erase on disk disk1s3


# diskutil list
/dev/disk0
   #: type                     name           size       identifier
   0: GUID_partition_scheme                   *149.1 GB  disk0
   1: EFI                                     200.0 MB   disk0s1
   2: Apple_HFS                Leopard        29.7 GB    disk0s2
   3: Apple_HFS                Leopard9A376   29.7 GB    disk0s3
   4: ZFS                      zfstest        29.7 GB    disk0s4
   5: ZFS                                     29.7 GB    disk0s5
   6: Apple_HFS                HFSJ_Boot      29.5 GB    disk0s6
/dev/disk1
   #: type                     name           size       identifier
   0: GUID_partition_scheme                   *153.4 GB  disk1
   1: EFI                                     200.0 MB   disk1s1
   2: Apple_HFS                Untitled 1     30.7 GB    disk1s2
   3: ZFS                                     30.7 GB    disk1s3
   4: Apple_HFS                Leopard9A376   30.7 GB    disk1s4
   5: Apple_HFS                LaCieLeopard   30.7 GB    disk1s5
   6: Apple_HFS                Leopard9A377a  29.9 GB    disk1s6


Now create the "oddcouple" pool and check your work:


# zpool create oddcouple mirror disk0s5 disk1s3


# zpool status oddcouple
  pool: oddcouple
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        oddcouple    ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            disk0s5  ONLINE       0     0     0
            disk1s3  ONLINE       0     0     0

errors: No known data errors
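

The "check your work" step for a pool such as "oddcouple" can be scripted. The following is a minimal sketch, assuming the "zpool
status" output format shown in this document; pool_healthy is a hypothetical helper that only looks at the overall "state:" line and
the "errors:" line, so it will not notice an individual device that is degraded while the pool as a whole still reports ONLINE.

```shell
# Hypothetical helper: reads `zpool status POOL` output on stdin and succeeds
# only if the pool state is ONLINE and no data errors are reported.
# Usage: zpool status oddcouple | pool_healthy
pool_healthy() {
    awk '
        $1 == "state:" && $2 == "ONLINE"  { online = 1 }
        /errors: No known data errors/    { clean = 1 }
        END { exit !(online && clean) }
    '
}

# Typical use on a live system:
#   zpool status oddcouple | pool_healthy && echo "oddcouple looks healthy"
```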


